Abstract. Time evolution of macroscopic systems is re-examined primarily through further
analysis and extension of the equation of motion for the density matrix $\rho(t)$. Because
$\rho$ contains both classical and quantum-mechanical probabilities it is necessary to account
for changes in both in the presence of external influences, yet standard treatments tend 
to neglect the former. A model of time-dependent classical probabilities is presented to 
illustrate the required type of extension to the conventional time-evolution equation, and 
it is shown that such an extension is already contained in the definition of the density 
matrix. 

1. Introduction 

A principal tenet of statistical thermodynamics for over a century has been that the 
microscopic constituents of macroscopic systems obey the fundamental dynamical laws 
of physics, but today there is still no broad consensus as to exactly how this translates 
into the time evolution of the macroscopic system itself. At the heart of the difficulty is 
the fact that macroscopic systems require a probabilistic description, so that a primary 
concern must lie with the time development of probabilities themselves, and rarely has 
this concern been addressed from first principles. In a quantum description the problem 
is further exacerbated by the presence of two kinds of probabilities in the density matrix: 
one intrinsic to the underlying quantum mechanics, and another pertaining to incomplete 
information in the context of classical probability theory. The aim of the following dis- 
cussion is to disentangle these two contributions, at least conceptually, in a manner that 
leads to unambiguous equations of motion for macroscopic systems and at the same time 
clarifies the foundation of the latter in the microscopic laws. 

A classical description of a many-body system begins by introducing an ensemble of 
like systems along with a phase function $f$ exhibiting their distribution in the phase space.
We shall find it more convenient to employ a quantum mechanical description instead, not 
only because the mathematical expressions are a bit less unwieldy, but also because it allows 
us to focus more readily and naturally on the single system under study. For an isolated 
system† the standard stages in such a study are the following: (i) construct an initial
density matrix $\rho_0$ describing the initial macroscopic state; (ii) let the system evolve under
its Hamiltonian H, thereby evolving the density matrix $\rho$ by the deterministic microscopic
equation of motion

$$ i\hbar\,\dot{\rho} = [H, \rho] , \tag{1} $$



† In many common cases the system can be merely closed, or even open, as long as fluctuations
are small; the only requirement is that there be no net external forces. 
where the superposed dot denotes a total time derivative; and, (iii) at time t use the
time-evolved $\rho(t)$ to predict the expectation value of a system variable C as

$$ \langle C(t)\rangle = \mathrm{Tr}[\rho(t)\,C] = \mathrm{Tr}[\rho_0\,C(t)] . \tag{2} $$

This last expression illustrates the equivalence of the Schrödinger and Heisenberg pictures,
for Eq.(1) itself is equivalent to a unitary transformation:

$$ \rho(t) = U(t)\,\rho_0\,U^\dagger(t) , \tag{3} $$
where the time-evolution operator U(t) is determined by

$$ i\hbar\,\dot{U} = H U , \tag{4} $$

with initial condition $\rho(t_0) = \rho_0$. For the moment we shall consider only the Schrödinger
picture in which $\rho(t)$ evolves in time; the density matrix is very definitely not a Heisenberg
operator. If $\rho$ is stationary, $[H,\rho] = 0$, it is a constant of the motion; if $\rho$ is a functional
only of constants of the motion the macroscopic state is one of thermodynamic equilibrium,
corresponding to maximum entropy, and stage (ii) is solved immediately.
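
The equivalence of the two pictures in (2) can be checked numerically for a small system. What follows is only a minimal sketch, assuming a randomly generated 4-level Hamiltonian, units with hbar = 1, and a time-independent H; the helper names `random_hermitian` and `evolution_operator` are hypothetical and do not come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_hermitian(n):
    """Random n x n Hermitian matrix (illustrative stand-in for H or C)."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (a + a.conj().T) / 2

def evolution_operator(h, t, hbar=1.0):
    """U(t) = exp(-i H t / hbar), built from the eigendecomposition of H."""
    energies, vecs = np.linalg.eigh(h)
    return vecs @ np.diag(np.exp(-1j * energies * t / hbar)) @ vecs.conj().T

n = 4
H = random_hermitian(n)
C = random_hermitian(n)

# Initial density matrix: positive weights on an arbitrary orthonormal basis.
weights = rng.random(n)
weights /= weights.sum()
_, basis = np.linalg.eigh(random_hermitian(n))
rho0 = basis @ np.diag(weights) @ basis.conj().T

U = evolution_operator(H, t=1.7)
rho_t = U @ rho0 @ U.conj().T          # Schrodinger picture, Eq. (3)
C_t = U.conj().T @ C @ U               # Heisenberg picture operator

schrodinger = np.trace(rho_t @ C).real
heisenberg = np.trace(rho0 @ C_t).real
print(schrodinger, heisenberg)
```

The two printed numbers agree to rounding error, which is just the cyclic property of the trace applied to (3).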

It would appear, at least in principle, that proceeding through these three stages must 
lead to a complete description of time-varying processes in a macroscopic system in terms
of the underlying microscopic dynamics. An exact solution of (1) is practicably out of
the question for any macroscopic system with a nontrivial Hamiltonian, of course, so that
many approximations have been pursued over the years. Classically the ensemble density of
N-particle systems, $f_N(q,p,t)$, where $\{q,p\}$ represents the collection of all 6N coordinates
of a system in phase space, satisfies the Liouville equation: $\partial_t f_N = \{H, f_N\}$, where the
right-hand side is a Poisson bracket. Integration over all coordinates but those of a single
particle yields the one-particle distribution $f_1$, for which an explicit equation of motion
at low densities is readily derived, the well-known Boltzmann equation. In a quantum
mechanical formulation much effort has been devoted to deriving the so-called master 
equation for a coarse-grained probability distribution over system states (Pauli, 1928). As 
emphasized by van Kampen (1962), the solutions describe a single system and there is no 
need for the notion of ensembles in this approach. Other lines of attack include projection 
operator techniques (Zwanzig, 1961; Mori, 1965), and the notion from the Brussels school 
that there may be an intrinsic irreversibility within the microscopic equations themselves. 

All of these attempts at obtaining macroscopic equations of motion tend to create some 
sort of irreversibility, but it is difficult to establish whether it is the real thing or simply 
an artifact of the approximations. We now demonstrate, however, that even imagining we 
could solve (1) exactly leads us into serious difficulties. 

There are two simple applications of our 3-stage scenario that lead us directly to some 
disturbing questions concerning its general applicability. The first of these involves the 
response of the system to the presence of a well-defined external field. The effects of this 
field on the system can often be sensibly described by including an additional term in the 
Hamiltonian, as in 

$$ H_t = H_0 - v(t)A , \tag{5} $$
where v(t) describes the time dependence of the external force and A is a system variable
(operator) coupling that force to the medium. A formal solution to (1) is given by

$$ \rho(t) = \rho_0 + \frac{i}{\hbar}\int_{t_0}^{t} U(t,t')\,v(t')\,[A,\rho_0]\,U^\dagger(t,t')\,dt' , \tag{6} $$

where $t_0$ is the time at which the external force is turned on. The interpretation here is that
prior to $t_0$ the system is in thermal equilibrium, and for $t > t_0$ the density matrix evolves
unitarily under the operator $U(t,t')$ determined by (4) with Hamiltonian $H_t$. At any later
time the response of the system, described by the departure of expectation values of other
operators C from their equilibrium values, is found by substitution of (6) into (2) to be

$$ \langle C(t)\rangle - \langle C\rangle_0 = \int_{t_0}^{t} \Phi_{AC}(t,t')\,v(t')\,dt' , \tag{7} $$

where $\langle\;\rangle_0$ denotes an equilibrium expectation value, and

$$ \Phi_{AC}(t,t') \equiv \frac{1}{i\hbar}\,\langle [A, C(t,t')] \rangle_0 \tag{8} $$

is called the dynamical response functional. The time dependence of C is given by
$C(t,t') = U^\dagger(t,t')\,C(t')\,U(t,t')$, which is effectively now a Heisenberg operator.
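
The step from the formal solution (6) to the response formula (7) can be spelled out. The following is a sketch under stated assumptions: the prefactor in (6) is taken to be $i/\hbar$, C on the right-hand side is treated as a Schrödinger-picture operator, and only the cyclic property of the trace is used.

```latex
\begin{aligned}
\langle C(t)\rangle - \langle C\rangle_0
 &= \mathrm{Tr}\{[\rho(t)-\rho_0]\,C\} \\
 &= \frac{i}{\hbar}\int_{t_0}^{t} v(t')\,
    \mathrm{Tr}\{U(t,t')\,[A,\rho_0]\,U^\dagger(t,t')\,C\}\,dt' \\
 &= \frac{i}{\hbar}\int_{t_0}^{t} v(t')\,
    \mathrm{Tr}\{[A,\rho_0]\,C(t,t')\}\,dt' \\
 &= \frac{i}{\hbar}\int_{t_0}^{t} v(t')\,
    \mathrm{Tr}\{\rho_0\,[C(t,t'),A]\}\,dt' \\
 &= \int_{t_0}^{t} \frac{1}{i\hbar}\,
    \langle [A, C(t,t')] \rangle_0\, v(t')\,dt' .
\end{aligned}
```

The last line uses $(i/\hbar)[C,A] = (1/i\hbar)[A,C]$, so the integrand is the equilibrium expectation value of a commutator multiplied by the driving function v(t').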

Most often U is approximated by $\exp[-i(t-t')H_0/\hbar]$, leading to the well-known
linear response theory. Although this approximation has been criticized as inappropriately 
approximating the microscopic dynamics (e.g., van Kampen, 1971), this is really the least 
of the problems that arise. Unquestionably the exact time-evolved $\rho(t)$ will predict the
correct value of $\langle C(t)\rangle$ at time $t > t_0$, for both the quantum mechanics and associated
mathematics are impeccable. But, as noted in (3), that time evolution is equivalent to a
unitary transformation, under which the eigenvalues of $\rho$ are unchanged. These eigenvalues
are the probabilities for the system to be found in any of its possible macrostates under 
given macroscopic constraints, hence there would seem to be no macroscopic change in the 
system; but there is such a change, of course, as indicated in (7). The equivalent classical 
observation is that the Liouville equation moves the ensemble distribution around the phase 
space subject to given constraints, but does not alter those constraints. A further difficulty 
is that the von Neumann entropy $S = -k\,\mathrm{Tr}(\rho\ln\rho)$, where k is Boltzmann's constant, is
also invariant under unitary transformation, indicating the absence of irreversible behavior 
during the process despite the possibility of energy having been added to the system 
throughout the time interval. 
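
The invariance argument can be illustrated numerically. The sketch below is not part of the original analysis: it assumes a random 5-level system, units with hbar = k = 1, inverse temperature 1, and a constant driving term standing in for v(t)A. It shows the spectrum and the von Neumann entropy staying fixed while the predicted energy changes.

```python
import numpy as np

rng = np.random.default_rng(1)

def von_neumann_entropy(rho, k=1.0):
    """S = -k Tr(rho ln rho), evaluated from the eigenvalues of rho."""
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-15]                   # 0 ln 0 is taken as 0
    return -k * np.sum(p * np.log(p))

n = 5
a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H = (a + a.conj().T) / 2               # arbitrary Hermitian Hamiltonian

# Canonical initial state rho0 = exp(-beta H) / Z, with beta = 1 assumed.
E, V = np.linalg.eigh(H)
boltz = np.exp(-(E - E.min()))
rho0 = V @ np.diag(boltz / boltz.sum()) @ V.conj().T

# Evolve under a different Hermitian generator, so that rho(t) != rho0.
b = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H_t = H - 0.8 * (b + b.conj().T) / 2   # plays the role of H - v A with constant v
Et, Vt = np.linalg.eigh(H_t)
U = Vt @ np.diag(np.exp(-1j * Et * 2.0)) @ Vt.conj().T
rho_t = U @ rho0 @ U.conj().T

eig0 = np.sort(np.linalg.eigvalsh(rho0))
eig_t = np.sort(np.linalg.eigvalsh(rho_t))
S0, St = von_neumann_entropy(rho0), von_neumann_entropy(rho_t)
energy_change = np.trace((rho_t - rho0) @ H).real
print(eig0, eig_t, S0, St, energy_change)
```

The eigenvalue lists and the two entropies coincide, while `energy_change` is nonzero: energy has been added, yet nothing in the unitary evolution registers it as a change of macrostate.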

A similar but more general application is the common task of heating a pot of water, 
on an electric stove, say. To describe this process in complete detail we have to specify the 
total system Hamiltonian $H_0$ of the closed system consisting of water, pot, electric burner,
and interactions among them:

$$ H_0 = H_{\rm water} + H_{\rm pot} + H_{\rm burner} + H_{\rm int} . \tag{9} $$

When the burner is turned on the voltage and current can be enfolded into an external
contribution $H_{\rm ext}(t)$, which leads to a total Hamiltonian $H_{\rm tot} = H_0 + H_{\rm ext}$ for the heating
process. With the water initially in thermal equilibrium with the rest of the system at
temperature $T_i$, we know that the initial density matrix is given by the canonical distribution

$$ \rho(0) = \frac{e^{-H_0/kT_i}}{Z_i} , \qquad Z_i = \mathrm{Tr}\,e^{-H_0/kT_i} . \tag{10} $$

If the burner is turned on for a period (0, t), then the density matrix for the isolated system
at time t is obtained by unitary transformation, as above. Upon turning off the switch one
expects the system to relax almost immediately to a final equilibrium state at temperature
$T_f$, and the conventional teaching is that $\rho(t)$ somehow goes over into the final canonical
density matrix known to describe thermal equilibrium,

$$ \rho(t) \longrightarrow \rho_f = \frac{e^{-H_0/kT_f}}{Z_f} , \qquad Z_f = \mathrm{Tr}\,e^{-H_0/kT_f} . \tag{11} $$

But this cannot happen: because the eigenvalues of $\rho_f$ and $\rho(t)$ are in general different, the
two density matrices are incompatible; the eigenvalues of $\rho(t)$ are just those of $\rho_i = \rho(0)$.
Indeed, as in the previous example, the theoretical entropy of the final state is the same as 
that of the initial state, whereas we are certain that the initial and final measured entropies 
cannot be the same. Where is the irreversibility of this process to be found? We shall 
address that question in Part II of this discussion (Grandy, 2003; following paper, herein 
referred to as II) ; our task here is to sort out the details of the time evolution process per 
se in the presence of external influences. 
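
The incompatibility of the initial and final canonical density matrices is easy to see in a toy case. The sketch below assumes a hypothetical four-level spectrum for the unperturbed Hamiltonian and units with k = 1; the temperatures are illustrative values only.

```python
import numpy as np

def canonical_probs(energies, kT):
    """Eigenvalues of the canonical density matrix exp(-H/kT)/Z."""
    w = np.exp(-(energies - energies.min()) / kT)
    return w / w.sum()

def entropy(p, k=1.0):
    """Entropy -k sum p ln p of a strictly positive distribution."""
    return -k * np.sum(p * np.log(p))

energies = np.array([0.0, 1.0, 2.0, 3.0])    # hypothetical spectrum
p_i = canonical_probs(energies, kT=1.0)      # initial temperature T_i
p_f = canonical_probs(energies, kT=2.0)      # final temperature T_f > T_i

# A unitary transformation preserves the spectrum, so it cannot map
# a density matrix with eigenvalues p_i onto one with eigenvalues p_f.
spectra_match = np.allclose(np.sort(p_i), np.sort(p_f))
print(p_i, p_f, entropy(p_i), entropy(p_f), spectra_match)
```

The two spectra differ and the entropy at the higher temperature is larger, so no unitary transformation connects the two states.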

2. Origin of The Difficulties 

The source of the above difficulties is that the Hamiltonian governs only the microscopic
behavior of the system. While there is little doubt that the $\rho(t)$ evolved under H will
make correct predictions of macroscopic expectation values, it does so by including only 
the local effects of the external forces, with no reference to either the external sources or the 
macroscopic constraints; and it is the changes in those constraints that constitute much of 
the thermodynamics. Time development by unitary transformation alone affects only the 
basic quantum mechanical probabilities, but not those of classical probability theory that 
are determined by the macroscopic constraints. Similarly, in the classical formulation the 
ensemble density is evolved by Liouville's equation and the meaning of the partial time 
derivative there is that an observer fixed in phase space would see the distribution move 
by without changing its shape. In either formulation the canonical microscopic equations 
of motion are ultimately responsible for the changing state of the system, to be sure, but 
both the impetus for these changes and the observed effects are macroscopic and must be 
included as such in any realistic analysis. 

To explore the origins of these matters more deeply and systematically it will be 
useful to return to the task of stage (i) mentioned earlier, and recall some of the details 
involved in constructing an initial density matrix, or probability distribution.‡ We adopt



‡ For simplicity we shall consider a discrete set of states or alternatives, which is equivalent to
employing a representation in which $\rho$ is diagonal. The description then appears as independent
of any particular physical application. 
the view that the probability for any one in a set of n mutually exclusive and exhaustive
alternatives $\{x_i\}$ is contingent on given information $I$, and will be written $P_i = P(x_i|I)$.
As first developed by Gibbs (1902), and elucidated further by Shannon (1948) and Jaynes
(1957), $P_i$ is determined uniquely by maximizing the information entropy

$$ S_I = -K\sum_i P_i\ln P_i , \qquad K = \text{constant} > 0 , \tag{12} $$

subject to constraints provided by $I$. The subscript $I$ emphasizes that this theoretical
entropy is defined in the context, and as a part, of probability theory. This is an important
caveat, for otherwise it is easy to confuse $S_I$ with physical or thermodynamic entropy. In
fact, merely by making this distinction we see that the invariance of the von Neumann
entropy $S = -k\,\mathrm{Tr}(\rho\ln\rho)$ under unitary transformation is not as great a problem as first
thought; it, too, should be considered an $S_I$, and it is only its maximum subject to physical
constraints that corresponds to thermodynamic entropy. Note that $S_I$ has the form of an
expectation value, $S_I = -K\langle\ln P\rangle$, which is just the negative of Gibbs' 'average index of
probability' that he minimized to define the equilibrium state.

With the advent of the Shannon-Jaynes insights into construction of prior probabilities
based on given evidence, the reasoning behind the Gibbs algorithm is immediately
transparent. Given information in terms of a function f(x), such that x can take one,
but only one, of the values $\{x_i\}$, the optimal choice of a probability distribution over
$\{x_i\}$ is obtained by maximizing the entropy of the probability distribution (12) subject to
constraints

$$ \sum_{i=1}^{n} P_i = 1 , \qquad P_i = P(x_i|I) > 0 , \tag{13a} $$

$$ I:\quad F \equiv \langle f(x)\rangle = \sum_{i=1}^{n} P_i f(x_i) . \tag{13b} $$

This is the principle of maximum entropy (PME), and in this form presumes the informa- 
tion to be given in the form of an expectation value. 

As is well known, the solution to this variational problem is most readily effected by 
the method of Lagrange multipliers, so that the desired probability distribution is given 
by 

$$ P_i = \frac{1}{Z}\,e^{-\lambda f(x_i)} , \qquad Z(\lambda) = \sum_i e^{-\lambda f(x_i)} , \tag{14} $$

and the normalization factor Z is called the partition function. Substitution of $P_i$ into
(13b) provides the differential equation formally determining the Lagrange multiplier $\lambda$:

$$ F = \langle f\rangle = -\frac{\partial}{\partial\lambda}\ln Z , \tag{15} $$

and the expected value for any other function g(x) is given by $\langle g\rangle = \sum_i P_i\,g(x_i)$. It
must be stressed that the expectation value on the left-hand side of (13b) is a known 
number F that we have identified in this way so as to incorporate the given information 
or data mathematically into a probability distribution. Whether we use one notation or 
the other will depend on which feature of these data we wish to emphasize in a particular
discussion. This scenario is readily generalized to include several pieces of data in terms
of several functions $\{f_r\}$, although it is also useful to note that not all these data need
be specified at once. For example, a distribution can be constructed via the PME based
on a datum $\langle f_1\rangle$; subsequently further information may emerge in the form $\langle f_2\rangle$, and the
new distribution is obtained by re-maximizing $S_I$ subject to both pieces of data. If the
new information contradicts the old there will be no solutions for real $\lambda_2$, and if the new
datum is irrelevant it will simply drop out of the distribution. This procedure provides 
a method for incorporating new information into an updated probability estimate, in the 
spirit of Bayes' theorem in elementary probability theory. 

Maximization of Si over all probability distributions subject to given constraints trans- 
forms the context of the discussion into one involving the maximum as a function of the 
data specific to this application; it is no longer a functional of probabilities, for they have 
been 'integrated out'. To acknowledge this distinction we shall denote the maximum entropy
by $S_{\rm theor}$, and recognize that it is now a function only of the measured expectation
values or constraint variables. That is, $S_{\rm theor}$ is the entropy of the macrostate, and the
impetus provided by information theory is no longer relevant. What remains of the notion
of information is now only to be found on the right-hand side of $P(A|I)$; we are here applying
probability theory, not information theory. Substitution of (14) into (12) provides
an explicit expression for the maximum entropy,

$$ S_{\rm theor} = K\ln Z + K\lambda\langle f\rangle . \tag{16} $$

The scenario described by Eq.(14) is precisely that leading to the canonical distribution 
(10) when the single piece of data involves the Hamiltonian, or total energy $E(E_i)$, where
$\{E_i\}$ is the set of possible system energies. For constants of the motion, H in this case, the
PME provides most of elementary classical thermodynamics and solves the tasks of stages 
(i) and (ii) in a single step. Stage (iii), of course, is very well developed mathematically 
for equilibrium systems, and we only note here that the expectation value minimizes the 
root-mean-square error in estimating $f$.
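
The maximum-entropy procedure of Eqs. (13)-(16) can be carried out numerically for a simple case. The sketch below assumes a six-sided die with f(x) = x and a prescribed mean F = 4.5, takes K = 1, and solves for the Lagrange multiplier by bisection; the function name `maxent_distribution` and the bracketing interval are illustrative choices, not part of the text.

```python
import numpy as np

def maxent_distribution(f_values, F, lo=-50.0, hi=50.0, tol=1e-12):
    """Solve <f> = F for lambda by bisection; return (lambda, P, Z) as in Eq. (14)."""
    f = np.asarray(f_values, dtype=float)

    def mean_f(lam):
        w = np.exp(-lam * (f - f.mean()))   # shifted exponent for numerical stability
        return np.sum(w * f) / np.sum(w)

    # <f> decreases monotonically with lambda, since d<f>/dlambda = -var(f).
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_f(mid) > F:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    weights = np.exp(-lam * f)
    Z = weights.sum()
    return lam, weights / Z, Z

faces = np.arange(1, 7)                     # x_i = 1..6, with f(x) = x
lam, P, Z = maxent_distribution(faces, F=4.5)

S_direct = -np.sum(P * np.log(P))           # Eq. (12) with K = 1
S_theor = np.log(Z) + lam * 4.5             # Eq. (16) with K = 1
print(lam, P, S_direct, S_theor)
```

Because the prescribed mean exceeds the uniform value 3.5, the multiplier comes out negative and the distribution is tilted toward the high faces; the two entropy values agree, as (16) requires.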

When the specified functions or operators are not constants of the motion, or they
vary in time, then $\rho$ no longer commutes with H, Eq.(1) must be addressed explicitly,
and we return to the conundrums raised above. Small changes in the problem defined by
Eqs.(13) can occur through changes in the set of possible values $\{f_i = f(x_i)\}$, as well as
from changes $\delta P_i$ in the assigned probabilities. A small change in the expectation value is
then

$$ \delta\langle f\rangle = \sum_i P_i\,\delta f_i + \sum_i f_i\,\delta P_i , \tag{17} $$

and one readily verifies that the corresponding change in the information entropy is

$$ \delta S_I = -K\sum_i \delta P_i \ln P_i . \tag{18} $$

The first sum on the right-hand side of (17) is just $\langle\delta f\rangle$, the expected change in $f$, so we
can rewrite that expression as

$$ \delta\langle f\rangle - \langle\delta f\rangle = \delta Q_f , \tag{19} $$
where $\delta Q_f \equiv \sum_i f_i\,\delta P_i$.

Equation (19) can be interpreted as a generalized First Law of thermodynamics, which 
is suggested by taking $f = E$, the total system energy. In that case we can interpret $\langle E\rangle$
as the predicted internal energy U and, since $\delta E_i$ is the work done on the system when it
is in state $E_i$, it must be that $\delta W = -\langle\delta E\rangle$ is the predicted work done by the system. In
this case, then, (19) has the form $\delta U + \delta W = \delta Q$, and $\delta Q$ is unambiguously identified as
heat. The latter is usually thought of as energy transfer through degrees of freedom over 
which we have no control, whereas work takes place through those we do control. But this 
is now seen as a special case of a more general rule in probability theory: a small change 
in any expectation value consists of a small change in the physical quantity ("generalized 
work") and a small change in the probability distribution ("generalized heat"). Just as 
with ordinary applications of the First Law, we see that the three ways to generate changes 
in any scenario are interconnected, and specifying any two determines the third. 
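
The decomposition in (17)-(19) holds to first order in the small changes, which a short numerical check makes concrete. This is only a sketch with randomly chosen values, K = 1, and a perturbation size `eps` picked for illustration; the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

n = 6
f = rng.normal(size=n)                  # values f_i = f(x_i)
P = rng.random(n) + 0.1                 # keep every P_i well away from zero
P /= P.sum()

eps = 1e-5
df = eps * rng.normal(size=n)           # small change in the values, delta f_i
dP = eps * rng.normal(size=n)
dP -= dP.mean()                         # preserve normalization: sum_i delta P_i = 0

d_mean = np.sum((P + dP) * (f + df)) - np.sum(P * f)   # delta <f>
work_like = np.sum(P * df)              # <delta f>, the "generalized work"
heat_like = np.sum(f * dP)              # delta Q_f, the "generalized heat"
remainder = d_mean - work_like - heat_like             # second-order term

# Entropy change: exact difference versus the first-order form of Eq. (18).
dS_exact = -np.sum((P + dP) * np.log(P + dP)) + np.sum(P * np.log(P))
dS_first_order = -np.sum(dP * np.log(P))
print(remainder, dS_exact - dS_first_order)
```

Both printed quantities are of second order in `eps`: the remainder is exactly the cross term, the sum over i of the product of the two small changes, and the entropy change is carried entirely by the change in the probabilities.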

The important point for the current discussion is that $\delta Q_f$ is effectively a source
term, and it arises only from a change in the probability distribution. From (18) it is then
clear that any change in the information entropy can only be induced by a $\delta Q$. Thus,
since a unitary transformation corresponding to the time-evolution equation (1) leaves
$S_I$ invariant, any complete description of the system evolution must contain some explicit
reference to the sources inducing that evolution. A source serves to change the macroscopic 
constraints on the system, which the microscopic equations alone cannot do, and this can 
lead to changes in the maximum entropy. In the case of thermal equilibrium this is simply 
thermodynamics at work: a small change in $\langle f\rangle$ induced by a source $\delta Q_f$ results in a new
macroscopic state corresponding to a re-maximized entropy, a readjustment brought about 
by the underlying particle dynamics, of course. 

A deeper issue uncovered here has to do with the nature of probability itself. Many 
writers subscribe to the view that objects and systems possess intrinsic probabilities that 
are actually physical properties like mass or charge. While most physicists would surely 
reject such a stance, the idea often seems to lurk in the background of many probabilistic 
discussions. One consequence of this viewpoint is that one may be led to believe that a 
density matrix $\rho(t)$ is a physically real quantity that is completely determined by the usual
dynamical equations of motion, rather than representing a state of knowledge about a
physical situation. This may work for an isolated system for which the only information
available is that encoded in the initial value $\rho(0)$, but we have seen above that this cannot
be the case when external influences are operative. The probabilities $\langle n|\rho|n\rangle$ can change
only if the information on which they are based changes. Thus, the resolution of the 
difficulties discussed above is to be found through re-examination of time evolution when 
changes in external constraints are explicitly taken into consideration. 

3. A Time-Dependent Probability Model 

The question of how to define time-dependent probabilities unambiguously is an old 
one, but no general theory seems to have been developed. In the real world any kind of 
change must have a physical origin, and so most expositions tend to focus on, or adapt 
physical equations of motion in one way or another to describe time varying probabilities. 
But this again risks viewing a probability as a physical object or property, rather than 
a representation of a state of knowledge. While quantum mechanical probabilities are
governed by microscopic physical laws, this is not necessarily the case for the macroscopic 
probabilities of interest here. The point of this section, then, is to develop a mathematical 
model that may provide some insight into the type of extension of the physical equations 
that we are seeking. 

Our problem is that of defining a time-dependent probability unambiguously. An 
understanding that all probabilities are conditional on some kind of information leads to 
the realization that $P(A_i|I)$ can change in time only if the information $I$ is changing in
time, while the propositions $A_i$ are taken as fixed. For example, if the trajectory y(t) of
a particle is changing erratically owing to the presence of an unknown time-varying field, 
the allowed values of y do not change, but information on the current position and some 
time-dependent effects of that field might permit construction of probabilities for which 
values of y might be realized at some later time t. 

Armed with this insight the path to extending the PME algorithm in a straightforward 
manner is clear. We shall do this in steps by introducing an abstract probability model 
that avoids possible confusion with physical laws for the time being. Equations (13)-(15) 
summarized very briefly the maximum-entropy procedure for constructing a probability 
distribution based on a single piece of time-independent information. Unless there is a 
definite reason to suppose that the observed value of $\langle f(x)\rangle$ is unvarying, as is the case
with equilibrium statistical mechanics where only constants of the motion are considered,
there is little to persuade us that a subsequent observation will not produce a different
value. One might, for example, harbor such thoughts while rolling a die made from a sugar
cube. To generalize our model somewhat let us suppose that $\langle f(x)\rangle$ is observed at a series
of times, and ultimately over a continuous time interval $[t_0,\tau]$. Since there is a piece of
data given at every instant in the interval, there is likewise a Lagrange multiplier defined 
at each point as well. We thus accumulate a body of information that, in the same manner 
as above, leads to the maximum-entropy prescription 

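$$ \begin{aligned}
P_i(t_0,\tau) &= Z^{-1}[\lambda]\,\exp\Big\{\int_{t_0}^{\tau}\lambda(t')\,f_i(t')\,dt'\Big\} , \\
Z[\lambda] &= \sum_i \exp\Big\{\int_{t_0}^{\tau}\lambda(t')\,f_i(t')\,dt'\Big\} , \\
\langle f(x,t)\rangle &= \frac{\delta}{\delta\lambda(t)}\ln Z[\lambda] , \qquad t_0 \le t \le \tau .
\end{aligned} \tag{20} $$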

Thus, Z is now a partition functional, $\lambda(t)$ a Lagrange-multiplier function defined only on
the interval $[t_0,\tau]$, and this function is determined through functional differentiation.

Although nothing in (20) is to be considered explicitly time dependent, it is true that
$\langle f(x,t)\rangle$, $\lambda(t)$, $f(t)$ vary over the interval $[t_0,\tau]$, but we know only that they do so there;
indeed, $\lambda(t)$ is defined only on that interval. The meaning of $\langle f(x,t)\rangle$ here is that at each
point of the interval we know a definite value of $f(x_i,t)$, and $\lambda(t)$ is determined such that
the mass of the probability distribution resides squarely on these points over that interval.
This scenario is just a generalization of stage (i) considered earlier and does nothing more
than provide an initial probability distribution at time $t = \tau$.

Some further features of (20) are worthy of note, beginning with the observation
that $\{P_i\}$ goes over into the uniform distribution as $\tau \to t_0$, as it should. If $f \neq f(t)$
it can be removed from the integrals and Eqs.(20) reduce to the time-independent case.
While physical processes must be causal, it can be shown (e.g., Jaynes, 1979) that logical
inferences can propagate either forward or backward in time, as in geology and astrophysics,
say. Thus, $\langle g(x)\rangle$ can not only be predicted for $t > \tau$, but also retrodicted for $t < t_0$.
Generally, as t increases beyond $\tau$ the accuracy of predictions made by this distribution
can be expected to deteriorate continually, especially if $f$ continues to vary; only new data
can contribute to a better estimate of $\{P_i\}$ at this point.

In the previous model we thought of the data being collected over a definite time 
interval, after which we maximized the entropy. Having made this first generalization we 
can see at once the next step. Information gathered in one interval can certainly be followed 
by collection in another a short time later, and can continue to be collected in a series of 
such intervals, the entropy being re-maximized after each interval. Now let those intervals 
become shorter and the intervals between them closer together, so that by an obvious 
limiting procedure they all blend into one continuous interval whose upper endpoint is 
always the present moment. Thus, there is nothing to prevent us from imagining a situation 
in which our information or data are continually changing in time; weather forecasts come 
to mind. With experiments performed in a more controlled manner, such information can 
be specified in detail and sources turned on and off at will; a common example is the slow 
heating of that pot of water. 

The leap made here is to imagine re-maximization occurring at every moment, rather 
than all at once. There is no fundamental conceptual difference between the two scenarios,
however, for in either case $f(x,t)$ must be known on the set $\{x_i\}$ during the basic
information-gathering time interval. Yet, how do we justify the notion of continual re- 
maximization? The key point to realize is that there is no causal signal involved here, 
and no physical readjustment to be made. For any imaginable set of constraints there 
is a corresponding unique maximum entropy, just as for a first-order differential equation 
there is a unique solution for any given initial condition and it is unnecessary to actually 
solve the equation to know the solution exists. When you warm up your half-cup of coffee 
by pouring more in from the pot, you've just re-maximized the entropy of the coffee to 
conform to the new N and E. 

Without further ado, we now envision continuous data in the form of a time-varying 
expectation value, 

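$$ \langle f(x,t)\rangle_t = \sum_i f_i(t)\,P_i(\tau_0,t) , \qquad f_i(t) \equiv f(x_i,t) , \quad P_i(t) \equiv P_i(\tau_0,t) . \tag{21} $$

That is, $f(x_i,t)$ is given at these points on $[\tau_0,t]$, and is specified to continue so until
further notice. Then, in analogy to (20),

$$ P_i(t) = Z_t^{-1}[\lambda]\,\exp\Big\{\int_{\tau_0}^{t}\lambda(t')\,f_i(t')\,dt'\Big\} , \tag{22a} $$

$$ Z_t[\lambda] = \sum_i \exp\Big\{\int_{\tau_0}^{t}\lambda(t')\,f_i(t')\,dt'\Big\} , \tag{22b} $$

$$ \langle f(x,t')\rangle_{t'} = \frac{\delta}{\delta\lambda(t')}\ln Z_{t'}[\lambda(t')] , \qquad \tau_0 \le t' \le t , \tag{22c} $$

$$ \langle g(x,t)\rangle_t = \sum_i P_i(t)\,g_i(t) . \tag{22d} $$

In these expressions the subscript t denotes expectation values computed with $\{P_i(t)\}$, and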
we note that the function g need not itself depend explicitly on time. Also, none of these 
quantities is necessarily a continuous function of x; rather, x simply denotes the sampling 
space for the discrete set {xi}. 

If $\langle f(x,t)\rangle$ is specified to be constant for all time, then $t \to \infty$ and we regain Eqs.(14).
And once again the distribution is uniform at $t = \tau_0$, whereas if the specified time variation
is halted at some time $t = \tau$ Eqs.(20) are regained.

The actual time variation of $P_i(t)$ is described by

$$ \partial_t P_i(t) = \lambda(t)\,\Delta f_i(t)\,P_i(t) , \tag{23a} $$

where

$$ \Delta f_i(t) \equiv f_i(t) - \langle f(x,t)\rangle_t . \tag{23b} $$

We verify that this integrates back into the original $P_i(t)$ by performing a functional
integration in (22), and in doing so obtain the useful alternative expression

$$ Z_t[\lambda] = \exp\Big\{\int_{\tau_0}^{t}\lambda(t')\,\langle f(x,t')\rangle_{t'}\,dt'\Big\} . \tag{24} $$

Equation (23a) has the form of a 'master' equation that is often introduced into time-dependent
scenarios; here it is exact.

Direct differentiation in (22d) yields the 'equation of motion'

$$ \begin{aligned}
\partial_t\langle g(x,t)\rangle_t &= \sum_i P_i(t)\big[\dot g_i(t) + \lambda(t)\,g_i(t)\,\Delta f_i(t)\big] \\
&= \langle\dot g(x,t)\rangle_t + \lambda(t)\,K_{fg}(x,t) ,
\end{aligned} \tag{25} $$
where we introduce the equal-time covariance function

$$ \begin{aligned}
K_{gf}(x,t) = K_{fg}(x,t) &\equiv \langle g(x,t)f(x,t)\rangle_t - \langle g(x,t)\rangle_t\,\langle f(x,t)\rangle_t \\
&= \frac{\delta}{\delta\lambda(t)}\langle g(x,t)\rangle_t .
\end{aligned} \tag{26} $$

Note that one can choose $g = f$; otherwise, g need not depend explicitly on t, in which
case the first term on the right-hand side of (25) vanishes. While of formal interest, these 
equations of motion are somewhat redundant in view of (22d); in physical applications, 
however, they lead to several important insights. 
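
The relations (23a) and (25) can be tested on a small numerical model. The sketch below is illustrative only: it assumes four states, hypothetical choices for the functions f_i(t), g_i(t) and the multiplier function lambda(t), a starting time of zero, and the sign convention in which the exponent of (22a) is plus the time integral of lambda times f_i. Time derivatives are taken by finite differences.

```python
import numpy as np

# Hypothetical model inputs (not from the text): 4 states, time grid from tau_0 = 0.
t = np.linspace(0.0, 2.0, 4001)
dt = t[1] - t[0]
a = np.array([0.0, 1.0, 2.0, 3.0])
f = a[:, None] + 0.5 * np.sin(np.outer(a + 1.0, t))     # f_i(t), shape (4, len(t))
lam = 0.4 + 0.3 * t                                     # lambda(t)
g = np.cos(a)[:, None] * (1.0 + 0.2 * t)[None, :]       # g_i(t)
g_dot = np.cos(a)[:, None] * 0.2 * np.ones_like(t)[None, :]

# Exponent of Eq. (22a): cumulative trapezoid of lambda(t') f_i(t') from tau_0 to t.
integrand = lam[None, :] * f
steps = 0.5 * (integrand[:, 1:] + integrand[:, :-1]) * dt
expo = np.concatenate([np.zeros((4, 1)), np.cumsum(steps, axis=1)], axis=1)
P = np.exp(expo)
P /= P.sum(axis=0, keepdims=True)                       # P_i(t), Eqs. (22a)-(22b)

mean_f = np.sum(P * f, axis=0)
mean_g = np.sum(P * g, axis=0)
K_fg = np.sum(P * f * g, axis=0) - mean_f * mean_g      # Eq. (26), first line

# Left- and right-hand sides of Eq. (23a) and Eq. (25).
dP_numeric = np.gradient(P, dt, axis=1)
dP_formula = lam[None, :] * (f - mean_f[None, :]) * P
dg_numeric = np.gradient(mean_g, dt)
dg_formula = np.sum(P * g_dot, axis=0) + lam * K_fg

interior = slice(1, -1)    # one-sided differences at the ends are less accurate
err_master = np.max(np.abs(dP_numeric - dP_formula)[:, interior])
err_motion = np.max(np.abs(dg_numeric - dg_formula)[interior])
print(err_master, err_motion)
```

Both discrepancies are at the level of the finite-difference error, and the distribution starts out uniform at the initial time, as stated in the text.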
Because $\{P_i\}$ is now time dependent the information-theoretic entropy analogous to
(16), and which was maximized continuously to obtain Eqs.(22), also depends on the time.
There is no implied relation to the thermodynamic entropy, of course, so we introduce yet
another entropic notation and write this functional as

$$ H_t[\{P_i\}] = -K\sum_{i=1}^{n} P_i(t)\ln P_i(t) , \qquad K > 0 . \tag{27} $$

Upon maximization this depends on the initial data only, so is a functional of $\langle f(x,t)\rangle_t$.
Substitution from (22) then yields

$$ \begin{aligned}
H_t[\langle f(x,t)\rangle] &= K\ln Z_t[\lambda] - K\int_{\tau_0}^{t}\lambda(t')\,\langle f(x,t')\rangle_t\,dt' \\
&= K\int_{\tau_0}^{t}\lambda(t')\,\langle f(x,t')\rangle_{t'}\,dt' - K\int_{\tau_0}^{t}\lambda(t')\,\langle f(x,t')\rangle_t\,dt' ,
\end{aligned} \tag{28} $$

the integrands differing only in the subscripts. Note that a functional differentiation yields
an alternative expression for $\lambda(t)$,

$$ \lambda(t) = -\frac{1}{K}\,\frac{\delta H_t}{\delta\langle f(x,t)\rangle_t} , \tag{29} $$

and a time derivative provides a rate of 'entropy production'

$$ \partial_t H_t = -K\lambda(t)\int_{\tau_0}^{t}\lambda(t')\,K_{ff}(t,t')\,dt' . \tag{30} $$

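
The rate expression (30) can likewise be checked on a toy model. The sketch below makes the same illustrative assumptions as before (four states, hypothetical f_i(t) and lambda(t), starting time zero, K = 1, positive exponent in (22a)) and additionally assumes that the two-time covariance in (30) is the average of f(t) f(t') taken with P_i(t), minus the product of the corresponding averages; that two-time definition is not written out in the text.

```python
import numpy as np

t = np.linspace(0.0, 2.0, 2001)          # time grid starting at tau_0 = 0
dt = t[1] - t[0]
a = np.array([0.0, 1.0, 2.0, 3.0])
f = a[:, None] + 0.5 * np.sin(np.outer(a + 1.0, t))   # f_i(t)
lam = 0.4 + 0.3 * t                                   # lambda(t)

def cumtrapz0(y, dx):
    """Cumulative trapezoid along the last axis, starting from zero."""
    steps = 0.5 * (y[..., 1:] + y[..., :-1]) * dx
    zero = np.zeros(y.shape[:-1] + (1,))
    return np.concatenate([zero, np.cumsum(steps, axis=-1)], axis=-1)

P = np.exp(cumtrapz0(lam * f, dt))
P /= P.sum(axis=0, keepdims=True)                     # Eq. (22a)
H = -np.sum(P * np.log(P), axis=0)                    # Eq. (27) with K = 1

def entropy_rate_formula(k):
    """Right-hand side of Eq. (30) at grid index k, with K = 1."""
    p = P[:, k]
    f_now = f[:, k]
    f_past = f[:, : k + 1]                            # f_i(t') for t' <= t
    mean_now = p @ f_now
    mean_past = p @ f_past                            # averages of f(t') taken with P_i(t)
    K_ff = p @ (f_now[:, None] * f_past) - mean_now * mean_past
    y = lam[: k + 1] * K_ff
    integral = np.sum(0.5 * (y[1:] + y[:-1]) * dt)    # trapezoid rule
    return -lam[k] * integral

dH_numeric = np.gradient(H, dt)
checks = [500, 1000, 1500]
errors = [abs(dH_numeric[k] - entropy_rate_formula(k)) for k in checks]
print(errors)
```

The finite-difference derivative of the entropy and the right-hand side of (30) agree to within discretization error at each of the sampled times, and the entropy starts from its uniform-distribution value ln 4.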

These equations are remarkably similar to many of those found in writings on irreversible
thermodynamics (e.g., de Groot and Mazur, 1962). Though no application to
physical models is made here, one recognizes the analogs of fluxes and forces, and Onsager- 
like reciprocity is immediately evident in (26). But no linear approximations are made in 
this model, so the current scenario is considerably more general. It must be emphasized 
once again, however, that the time dependencies derived above are based entirely on the 
supposition of information supplied in the form of specified time-varying expectation values
or source functions; we actually know how f(x,t) varies in time, allowing us to predict the 
variation of g(x,t). A possible general application of these considerations might be made 
to driven noise, in which the noise amplitude varies in a known way. 

4. The Physical Problem 

Precisely how to adapt the preceding probability model to macroscopic systems will 
be taken up in Part II of this discussion, while here we shall complete the study of Eq.(1)
and the fundamental time-evolution equation. If we believe that only an external source
can produce changing macroscopic constraints and time-varying information $I(t)$, then $\rho(t)$
must evolve in a manner over and above that determined by the Hamiltonian alone. In
fact, such an additional evolution is already implied by the density-matrix formalism, as
we now demonstrate. 

In the equation of motion (1) for the density matrix we noted that the superposed dot
represented a total time derivative. But in many works the equation is commonly written as

$$ i\hbar\,\partial_t \rho = [H,\rho], \tag{31} $$

where $H$ may be time dependent. The standard argument is that the derivative in (31) is
indeed a partial derivative because this expression is derived directly from the Schrödinger
equation, which contains a partial time derivative, although it makes no difference in (31)
since $\rho$ depends only on $t$. This comment would not be notable were it not for an
additional interpretation made in most writings on statistical mechanics, where $\rho$
describes the entire macroscopic system.

Equation (31) is often compared with the Heisenberg equation of motion for an operator
$F(t)$ in the Heisenberg picture,

$$ \frac{dF}{dt} = \frac{1}{i\hbar}[F,H] + \partial_t F, \tag{32} $$

whereupon it is concluded from analogy with Liouville's theorem that $d\rho/dt = 0$ and that
(31) is just the quantum mechanical version of Liouville's equation in classical statistical
mechanics. But there is nothing in quantum mechanics that requires this conclusion, for
$\rho(t)$ is not a Heisenberg operator; it is basically a projection operator constructed from
state vectors, and in any event (31) refers to the Schrödinger picture. Heisenberg operators
are analogous to functions of phase in classical mechanics; $\rho$ is not. We shall argue here
that the derivative in (31) should be considered a total time derivative, as asserted earlier
for (1); this follows from a careful derivation of that equation.

A density matrix represents a partial state of knowledge of a system. Based on that
information we conclude that with probability $w_1$ the system may be in a pure state
$\psi_1$, or in state $\psi_2$ with probability $w_2$, etc. Although the various alternatives
$\psi_i$ are not necessarily mutually orthogonal, they can be expanded in terms of a complete
orthonormal set $\{u_k\}$:

$$ \psi_i(r,t) = \sum_k a_{ik}(t)\,u_k(r), \tag{33} $$

such that $\langle u_k|u_j\rangle = \delta_{kj}$. The quantum mechanical expectation value of a
Hermitian operator $F$ in state $\psi_i$ is

$$ \langle F\rangle_i = \langle\psi_i|F|\psi_i\rangle = \sum_{k,n} a_{ik}\,a^*_{in}\,\langle u_n|F|u_k\rangle. \tag{34} $$

The expected value of $F$ over all possibilities (in the sense of classical probability theory)
is then

$$ \langle F\rangle = \sum_i w_i\,\langle F\rangle_i, \qquad \sum_i w_i = 1. \tag{35} $$
This last expression can be written more compactly (and generally) in matrix form as

$$ \langle F\rangle = \mathrm{Tr}(\rho F), \tag{36} $$

where the density matrix $\rho$ is defined in terms of its matrix elements:

$$ \rho_{kn} = \sum_i a_{ik}\,a^*_{in}\,w_i. \tag{37} $$

This expression is equivalent to writing $\rho$ as a weighted sum of projection operators onto
the states $\psi_i$: $\rho = \sum_i w_i\,|\psi_i\rangle\langle\psi_i|$.
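
The equivalence of the weighted average (35) and the trace form (36) is easy to confirm
numerically, including for states that are not mutually orthogonal. The sketch below is an
illustration only: the two-dimensional basis, the coefficients `psi`, the weights `w`, and
the observable `F` are hypothetical values chosen for the check.

```python
import math

# Two normalized but non-orthogonal pure states, given by their expansion
# coefficients a_ik in an orthonormal basis {u_1, u_2} (hypothetical values).
psi = [
    [1.0 + 0j, 0.0 + 0j],                        # psi_1 = u_1
    [1 / math.sqrt(2) + 0j, 1j / math.sqrt(2)],  # psi_2 = (u_1 + i u_2)/sqrt(2)
]
w = [0.3, 0.7]  # classical probabilities w_i, summing to 1

# Hermitian observable F in the same basis, F[n][k] = <u_n|F|u_k> (hypothetical).
F = [[1.0 + 0j, 0.5 - 0.2j],
     [0.5 + 0.2j, -1.0 + 0j]]

dim = 2

def expectation(state, op):
    # Eq. (34): <F>_i = sum_{k,n} a_ik a_in^* <u_n|F|u_k>
    return sum(state[k] * state[n].conjugate() * op[n][k]
               for k in range(dim) for n in range(dim))

# Eq. (37): rho_kn = sum_i w_i a_ik a_in^*
rho = [[sum(w[i] * psi[i][k] * psi[i][n].conjugate() for i in range(len(w)))
        for n in range(dim)] for k in range(dim)]

def trace_product(a, b):
    # Tr(a b) = sum_{k,n} a_kn b_nk
    return sum(a[k][n] * b[n][k] for k in range(dim) for n in range(dim))

avg_weighted = sum(w[i] * expectation(psi[i], F) for i in range(len(w)))  # Eq. (35)
avg_trace = trace_product(rho, F)                                         # Eq. (36)
trace_rho = rho[0][0] + rho[1][1]
print(avg_weighted, avg_trace, trace_rho)
```

Both averages coincide and are real, and $\mathrm{Tr}\,\rho = 1$ follows from the
normalization of the $\psi_i$ and of the weights.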

To find an equation of motion for $\rho$ we recall that each $\psi_i$ must satisfy the
Schrödinger equation $i\hbar\,\partial_t\psi_i = H\psi_i$, and from (33) we find that this
implies the equations of motion

$$ i\hbar\,\dot a_{ij} = \sum_k a_{ik} H_{jk}, \qquad H_{jk} = \langle u_j|H|u_k\rangle. \tag{38} $$

The superposed dot here denotes a total time derivative, for $a_{ij}$ describes a particular
state and depends only on the time. One thus derives an equation of motion for $\rho$ by direct
differentiation in (37), but this requires some prefatory comment.

Usually the weights $w_i$ are taken to be constants, determined by some means outside
the quantum theory itself. In fact, they are probabilities and can be determined in principle
from the PME under constraints representing information characterizing the state of the
system. As noted earlier, however, if that information is changing in time, as with a
nonequilibrium state, then the probabilities will also be time dependent. Hence, quite
generally one should consider $w_i = w_i(t)$; if such time dependence is absent we recover the
usual situation.

An equation of motion for $\rho$ is now found in a straightforward manner, with the help
of (37) and (38), by computing its total time variation:

$$ i\hbar\,\dot\rho_{kn} = \sum_q \left(H_{kq}\,\rho_{qn} - \rho_{kq}\,H_{qn}\right) + i\hbar \sum_i \dot w_i\,a_{ik}\,a^*_{in}, \tag{39} $$

or in operator notation

$$ i\hbar\,\dot\rho = [H,\rho] + i\hbar\,\partial_t\rho. \tag{40} $$

The term $i\hbar\,\partial_t\rho$ is meant to convey only the time variation of the $w_i$.
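
Equation (40) can be verified numerically for a simple case. The following sketch assumes a
hypothetical two-level system with diagonal Hamiltonian, units with $\hbar = 1$, and an
arbitrary smooth choice of weights $w_i(t)$; none of these specifics come from the text. It
compares a finite-difference estimate of $i\hbar\,\dot\rho$ with the commutator plus the
source term built from the $\dot w_i$.

```python
import cmath
import math

HBAR = 1.0             # units with hbar = 1 (assumption)
ENERGY = [0.5, -0.5]   # diagonal H = diag(E_1, E_2), hypothetical two-level system
A0 = [                 # coefficients a_ik at t = 0 (hypothetical)
    [1 / math.sqrt(2), 1 / math.sqrt(2)],  # psi_1 = (u_1 + u_2)/sqrt(2)
    [1.0, 0.0],                            # psi_2 = u_1
]

def weights(t):
    # Hypothetical externally driven probabilities with w_1 + w_2 = 1.
    w1 = 0.5 + 0.3 * math.sin(t)
    return [w1, 1.0 - w1]

def weight_rates(t):
    # Time derivatives of the weights above.
    return [0.3 * math.cos(t), -0.3 * math.cos(t)]

def coeffs(t):
    # Solution of (38) for diagonal H: a_ik(t) = a_ik(0) exp(-i E_k t / hbar).
    return [[A0[i][k] * cmath.exp(-1j * ENERGY[k] * t / HBAR) for k in range(2)]
            for i in range(2)]

def mixture(t, w):
    # sum_i w_i a_ik(t) a_in(t)^*, the structure of Eq. (37)
    a = coeffs(t)
    return [[sum(w[i] * a[i][k] * a[i][n].conjugate() for i in range(2))
             for n in range(2)] for k in range(2)]

def rho(t):
    return mixture(t, weights(t))

def lhs(t, dt=1e-5):
    # i hbar d(rho)/dt by central finite difference.
    rp, rm = rho(t + dt), rho(t - dt)
    return [[1j * HBAR * (rp[k][n] - rm[k][n]) / (2 * dt) for n in range(2)]
            for k in range(2)]

def rhs(t):
    r = rho(t)
    source = mixture(t, weight_rates(t))  # sum_i (dw_i/dt) a_ik a_in^*
    # For diagonal H, [H, rho]_kn = (E_k - E_n) rho_kn.
    return [[(ENERGY[k] - ENERGY[n]) * r[k][n] + 1j * HBAR * source[k][n]
             for n in range(2)] for k in range(2)]

t = 1.3
lhs_val, rhs_val = lhs(t), rhs(t)
max_diff = max(abs(lhs_val[k][n] - rhs_val[k][n]) for k in range(2) for n in range(2))
source_size = abs(mixture(t, weight_rates(t))[0][1])
print(max_diff, source_size)
```

The residual `max_diff` is at the level of the finite-difference error, while `source_size`
is not small, so in this example the commutator alone would not reproduce $\dot\rho$.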

Comparison of (40) with (32) confirms that the former is not a Heisenberg equation
of motion: the commutators have opposite signs. Indeed, in the Heisenberg picture
the only time variation in the density matrix is given by $\partial_t\rho$, which arises
qualitatively in the same general manner as $P_i(t)$ in the preceding probability model. If,
in fact, the probabilities $w_i$ are constant, as in equilibrium states, then (40) verifies (31)
but with a total time derivative. Otherwise, (40) is the general equation of motion for the
density matrix, such that the first term on the right-hand side describes the usual unitary
time development of an isolated system. The presence of external sources, however, can lead
to an explicit time dependence as represented by the second term, and thus the evolution
is not completely unitary; classically, Liouville's theorem is not applicable. An additional
source term of this kind also appears in the work of Zubarev et al. (1996) and in Kubo
et al. (1985), but of a considerably different origin and unrelated to the basic probabilities.
5. Summary

Equation (40) is the sought-after extension to macroscopic systems of the equation of
motion for the density matrix. The difference between this equation and the canonical
version (1) is evident; it is the difference in their solutions that is most important. No
matter what the solution to (1), it is always equivalent to a unitary transformation of the
initial density matrix, and hence incapable of describing an irreversible process completely;
some approximations of that equation, however, may exhibit various aspects of irreversible
behavior. The $\rho(t)$ evolved by (1) in conjunction with the total Hamiltonian is certainly a
correct result of quantum mechanics, but from a macroscopic viewpoint it is incomplete; it
contains no new macroscopic information about the processes taking place. It can predict
changes in the macroscopic constraints, yet is not itself affected subsequently by those
changes; it reflects changes in the quantum mechanical probabilities but not in the $w_i$
in (37), which are determined by the external constraints. An example illustrating these
points is given in Part II.
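
The contrast drawn here can be made concrete with a small numerical sketch. It assumes a
hypothetical two-state system and uses $-\mathrm{Tr}(\rho\ln\rho)$ as a convenient function
of the eigenvalues of $\rho$; the particular weights and unitary parameters are illustrative
choices only. A unitary transformation leaves the eigenvalues untouched, whereas a change in
the weights $w_i$ does not.

```python
import cmath
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    # Conjugate transpose.
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def eigenvalues(rho):
    # Closed form for a 2x2 Hermitian matrix: roots of x^2 - tr*x + det = 0.
    tr = (rho[0][0] + rho[1][1]).real
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return [(tr - disc) / 2.0, (tr + disc) / 2.0]

def entropy(rho):
    # -Tr(rho ln rho), with the convention 0 ln 0 = 0.
    return -sum(p * math.log(p) for p in eigenvalues(rho) if p > 1e-15)

def mixture(w1):
    # rho = w1 |u_1><u_1| + (1 - w1) |u_2><u_2|
    return [[complex(w1), 0j], [0j, complex(1.0 - w1)]]

def unitary(theta, phi):
    # A 2x2 unitary matrix with hypothetical parameters theta and phi.
    c, s = math.cos(theta), math.sin(theta)
    e = cmath.exp(1j * phi)
    return [[c, -s * e.conjugate()], [s * e, c]]

rho0 = mixture(0.9)
u = unitary(0.7, 0.4)
rho_unitary = matmul(matmul(u, rho0), dagger(u))  # unitary evolution, as under (1)
rho_driven = mixture(0.6)                         # weights changed by an external source

s0 = entropy(rho0)
s_unitary = entropy(rho_unitary)
s_driven = entropy(rho_driven)
print(s0, s_unitary, s_driven)
```

Here `s_unitary` equals `s0` to rounding error, while `s_driven` is larger because the
weights themselves have moved toward equality.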

To elaborate further on these differences, let us recall Boltzmann's enormously insightful
relation between the maximum entropy and phase space volumes (or manifolds in
a Hilbert space). In the form articulated by Planck this is

$$ S_B = k \ln W, \tag{41} $$

where $W$ is a measure of the set of microscopic states compatible with the macroscopic
constraints on the system; it is a multiplicity factor. We emphasize that this expression is to
be a representation of the maximum of the information entropy and, as Boltzmann himself
observed, it is not restricted to equilibrium states. Although it sometimes is stated that
(41) constitutes a proper definition of time-dependent entropy for nonequilibrium states
(e.g., Lebowitz, 1999), there is no theoretical or mathematical basis for this assertion.
Rather, Boltzmann's formulation, which can actually be expressed as a theorem (e.g.,
Grandy, 1980), provides a deep physical explanation of what is achieved by maximizing
the information entropy in the manner of Gibbs; namely, $S_B$ characterizes that huge set
of microstates compatible with the macroscopic constraints, each microstate contributing
to $W$ having probability roughly equal to $W^{-1}$. In addition, (41) also illustrates through
Liouville's theorem on conservation of phase volume why the maximum entropy itself
remains unchanged under canonical (or unitary) transformation; that is, no macroscopic
information is either gained or lost during evolution under (1). Yet (41) must change
under the action of external forces; but how?
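
As a brief numerical aside on (41), the sketch below checks that the information entropy of
$W$ equally probable states is exactly $k\ln W$, and that a non-uniform distribution over the
same states gives a smaller value. The value $W = 1000$, the choice $k = 1$, and the
particular skewed distribution are hypothetical.

```python
import math

def info_entropy(p, k=1.0):
    # -k sum_i p_i ln p_i, with the convention 0 ln 0 = 0.
    return -k * sum(x * math.log(x) for x in p if x > 0.0)

W = 1000
uniform = [1.0 / W] * W            # each microstate has probability 1/W
s_uniform = info_entropy(uniform)

# A non-uniform distribution on the same W states (hypothetical).
raw = [1.0 + (i % 7) for i in range(W)]
total = sum(raw)
skewed = [x / total for x in raw]
s_skewed = info_entropy(skewed)

print(s_uniform, math.log(W), s_skewed)
```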

Let us return to the scenario of heating a pot of water on an electric burner, where
initially the entire system is in equilibrium with multiplicity $W_i$. As energy $\Delta Q$ is
added to the water its temperature rises and the number of microscopic configurations
corresponding to the new constraints increases enormously, until at some point the plug is
pulled and the entire system relaxes to a final equilibrium state with multiplicity
$W_f \gg W_i$. As a consequence, the entropy increases to $S_B(\text{final}) > S_B(\text{initial})$,
and this is completely equivalent to re-maximizing the information entropy subject to the
constraint $E_f = E_i + \Delta Q$. Note, however, that one can imagine carrying out such a
re-maximization at any time during the process, for that maximization is also equivalent to
acknowledging the existence of a definite phase volume of compatible microscopic states at
any instant.
Thus, the multiplicity factor $W$ increases to its final value owing to a change in the
macroscopic constraint provided by the total energy. In turn, this can come about only
by an evolution of the weights $w_i$ in (39). Because the only time variation in $\rho(t)$ in the
Heisenberg picture is that of these weights, we suspect there may be more direct ways to
determine the appropriate density matrix than by trying to solve an incredibly complex
differential equation. After all, from a macroscopic standpoint we need only know $\Delta Q$
and the heat capacity of the water to predict its final temperature and energy. Explicit
construction of $\rho(t)$ and $S(t)$ appropriate for describing nonequilibrium phenomena in these
systems is carried out in Part II (following paper).
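
The macroscopic bookkeeping for the pot of water can be sketched in a few lines. All numbers
below are hypothetical (1 kg of water, a temperature-independent specific heat of about
4186 J/(kg K), an initial temperature of 293.15 K, and $\Delta Q = 10^5$ J), and the estimate
of $\ln(W_f/W_i)$ assumes that (41) applies to the initial and final equilibrium states.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant in J/K (exact SI value)

# Hypothetical numbers for the pot of water.
mass = 1.0           # kg
c_water = 4186.0     # J/(kg K), treated as temperature independent (assumption)
t_initial = 293.15   # K
delta_q = 1.0e5      # J of heat added

heat_capacity = mass * c_water
t_final = t_initial + delta_q / heat_capacity

# Entropy difference between the two equilibrium states for constant heat capacity.
delta_s = heat_capacity * math.log(t_final / t_initial)

# From S_B = k ln W: ln(W_f / W_i) = [S_B(final) - S_B(initial)] / k.
log_multiplicity_ratio = delta_s / K_B

print(t_final, delta_s, log_multiplicity_ratio)
```

With these numbers the temperature rises by roughly 24 K, while $\ln(W_f/W_i)$ is of order
$10^{25}$, which conveys how enormously the multiplicity grows.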